# End-to-end multimodal

Qwen2.5 Omni 7B AWQ

Qwen2.5-Omni is an end-to-end multimodal model that accepts text, image, audio, and video input and generates text and natural speech responses in a streaming manner.

Multimodal Fusion

Transformers, English

Qwen2.5 Omni 3B

Qwen2.5-Omni is an end-to-end multimodal model that accepts text, image, audio, and video input and simultaneously generates text and natural speech responses in a streaming manner.

Multimodal Fusion

Transformers, English


© 2025 AIbase